Racial Disparity in Natural Language Processing: A Case Study of Social Media African-American English

Authors

  • Su Lin Blodgett
  • Brendan T. O'Connor
Abstract

We highlight an important frontier in algorithmic fairness: disparity in the quality of natural language processing algorithms when applied to language from authors of different social groups. For example, current systems sometimes analyze the language of females and minorities more poorly than they do of whites and males. We conduct an empirical analysis of racial disparity in language identification for tweets written in African-American English, and discuss implications of disparity in NLP.

Similar resources

Demographic Dialectal Variation in Social Media: A Case Study of African-American English

Though dialectal language is increasingly abundant on social media, few resources exist for developing NLP tools to handle such language. We conduct a case study of dialectal language in online conversational text by investigating African-American English (AAE) on Twitter. We propose a distantly supervised model to identify AAE-like language from demographics associated with geo-located message...

Challenges of studying and processing dialects in social media

Dialect features typically do not make it into formal writing, but flourish in social media. This enables large-scale variational studies. We focus on three phonological features of African American Vernacular English and their manifestation as spelling variations on Twitter. We discuss to what extent our data can be used to falsify eight sociolinguistic hypotheses. To go beyond the spelling lev...

The Creation of an Intercultural Learning Experience in EFL Contexts

The present study aimed to examine the efficacy of using literary texts in promoting intercultural communication competence, and intercultural awareness and understanding, within language teaching contexts. The participants were 50 Iranian undergraduate students of English Literature, 20 male and 30 female, aged 19 to 24, engaged in reading and discussing literary texts wi...

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the growth of digital content on the internet and social media, sentiment analysis has become an emerging field. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...

Social Identity Theory in Toni Morrison’s Sula

The concept of identity and its formation is one of the most basic notions in the field of social psychology. Many psychologists and sociologists have presented their theories based on this concept and the psychosocial progress of its formation in social contexts. Henri Tajfel, a prominent social psychologist, in his Social Identity Theory has divided an individual’s identity into two parts: “pe...

Journal title:
  • CoRR

Volume abs/1707.00061

Publication date: 2017